Phenotypic subtypes of fibrotic hypersensitivity pneumonitis identified by machine learning consensus clustering analysis

Background: Patients with fibrotic hypersensitivity pneumonitis (f-HP) have varied clinical and radiologic presentations whose associated phenotypic outcomes have not been previously described. We conducted a study to evaluate mortality and lung transplant (LT) outcomes among clinical clusters of f-HP as characterized by an unsupervised machine learning approach.
Methods: Consensus cluster analysis was performed on a retrospective cohort of f-HP patients diagnosed according to a recent international guideline. Demographics, antigen exposure, radiologic, histopathologic, and pulmonary function findings, along with comorbidities, were included in the cluster analysis. Cox proportional-hazards regression was used to assess mortality or LT risk as a combined outcome for each cluster.
Results: Three distinct clusters were identified among 336 f-HP patients. Cluster 1 (n = 158, 47%) was characterized by mild restriction on pulmonary function testing (PFT). Cluster 2 (n = 46, 14%) was characterized by younger age, lower BMI, and a higher proportion of identifiable causative antigens with baseline obstructive physiology. Cluster 3 (n = 132, 39%) was characterized by moderate to severe restriction. When compared to cluster 1, mortality or LT risk was lower in cluster 2 (hazard ratio (HR) of 0.42; 95% CI, 0.21–0.82; P = 0.01) and higher in cluster 3 (HR of 1.76; 95% CI, 1.24–2.48; P = 0.001).
Conclusions: Three distinct phenotypes of f-HP with unique mortality or transplant outcomes were found using unsupervised cluster analysis, highlighting improved mortality in fibrotic patients with obstructive physiology and identifiable antigens.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12931-024-02664-x.


Tananchai Petnak 1,4, Wisit Cheungpasitporn 2, Charat Thongprayoon 2, Tulaton Sodsri 3, Supawit Tangpanithandee 2 and Teng Moua 4*

Background
Hypersensitivity pneumonitis (HP) is an immune-mediated interstitial lung disease characterized by injury from inhaled organic or inorganic antigens [1,2]. The 2020 ATS/JRS/ALAT clinical practice guideline categorizes HP into fibrotic and non-fibrotic subtypes based on radiologic or histopathologic findings [1]. Patients with fibrotic hypersensitivity pneumonitis (f-HP) have worse survival than those with non-fibrotic disease, with an all-cause mortality rate of 67.5 per 1000 person-years [3]. Identification and avoidance of causative antigens has recently been described as associated with better survival in those with fibrotic disease [4]. Exposure type (e.g., avian vs. mold vs. bacterial) may also be associated with differential outcomes [4]. Specific radiologic findings among patients with lung fibrosis may be correlated with lower forced vital capacity (FVC) or lung function [5]. Although multiple studies have reported the association of specific clinical domains with survival in f-HP, concomitant domains or phenotype analyses have not been previously described.
Machine learning and artificial intelligence have advanced the diagnostic and prognostic use of clinical parameters in medicine. Prior cohort studies have found that specific variables are associated with outcome but have not incorporated them into phenotypic subgroups or structuring. An additional benefit of phenotyping may be tailoring treatments according to subgroup characteristics, particularly in the context of heterogeneously presenting diseases like HP. Recent studies have shown that clustering methodology may differentiate unique phenotypes with distinct clinical courses or outcomes [6][7][8]. We conducted a study using unsupervised machine learning to identify clinical phenotypes in f-HP and assess their comparative mortality and transplant risk.

Subject selection
This study is a single-center retrospective cohort conducted at Mayo Clinic, Rochester. Suspected f-HP patients diagnosed between January 2005 and December 2020 were identified using a computer-assisted search. Each medical record was reviewed by study investigators to verify exposure history, serum specific IgG testing, radiologic findings, bronchoalveolar lavage analysis, and histopathology if obtained. Patients were classified as having an identifiable causative antigen if there was documentation of a suspected environmental exposure, regardless of serum specific IgG testing. Final diagnosis of f-HP was based on the 2020 ATS/JRS/ALAT clinical practice guideline [1], which specifies levels of diagnostic confidence. Diagnoses were categorized as definite (level of confidence ≥ 90%), high (80-89%), moderate (70-79%), or low confidence (51-69%). Patients with diagnostic confidence < 50% or missing baseline pulmonary function testing (PFT) were excluded. Our study was approved by the Mayo Clinic Institutional Review Board (approval No. 20-000211).
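The confidence bands above can be written as a small lookup. This is a minimal sketch, not the investigators' procedure: the function name `confidence_category` is hypothetical, and the handling of non-integer percentages is an assumption. The text does not assign a value of exactly 50% to any band, so the sketch returns "unspecified" there rather than guessing.

```python
def confidence_category(confidence_pct: float) -> str:
    """Map a diagnostic confidence percentage to the bands quoted in the text.

    Assumption: non-integer values are placed by simple thresholds
    (for example, 89.5 is treated as "high").
    """
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence_pct >= 90:
        return "definite"
    if confidence_pct >= 80:
        return "high"
    if confidence_pct >= 70:
        return "moderate"
    if confidence_pct > 50:
        return "low"
    if confidence_pct < 50:
        return "excluded"
    # Exactly 50% is not covered by the stated bands or the exclusion rule.
    return "unspecified"


for pct in (95, 85, 72, 60, 40):
    print(pct, confidence_category(pct))
```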

Data collection
In addition to diagnostic variables, we collated age, sex, smoking status, body mass index (BMI), presenting PFTs as percent predicted findings for total lung capacity (TLC%), forced vital capacity (FVC%), forced expiratory volume in the first second (FEV1%), FEV1/FVC ratio, and diffusion capacity for carbon monoxide (DLCO%), and selected comorbidities (see Table 1). Missing non-PFT data were imputed by the Random Forest method [9]. Radiologic findings included presence of mosaic attenuation, honeycombing, and probable or consistent usual interstitial pneumonia (UIP) patterns on high-resolution computed tomography (HRCT). Dates of death, LT, or last follow-up were used to assess long-term outcomes.

Clustering analysis
We used an unsupervised machine learning consensus clustering approach to identify clinical subtypes of patients with f-HP [10]. A pre-specified subsampling parameter of 80% with 100 iterations was used. The number of potential clusters (k) was set to a range of two to ten to avoid excessive cluster numbers and clinically irrelevant groupings. The optimal number of clusters was determined by a consensus matrix (CM) heat map, cumulative distribution function (CDF), cluster-consensus plots of the within-cluster consensus scores, and proportion of ambiguously clustered (PAC) pairs. The within-cluster consensus score, an average consensus value for all pairs of individuals in the same cluster, ranged between 0 and 1 [11]. A value closer to 1 indicated better cluster stability. PAC, ranging between 0 and 1, was defined as the proportion of all sample pairs with consensus values falling within the predetermined boundaries [12]. A value closer to zero indicated better cluster stability [12]. Additional details of consensus clustering algorithms are described in the Supplementary file.
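The two stability measures defined above can be illustrated on a toy consensus matrix. The sketch below is an illustration only and does not reproduce the study's analysis, which was run in R. The function names, the toy matrix, and the PAC boundaries of 0.1 and 0.9 are assumptions; the text says only that the boundaries were predetermined.

```python
from itertools import combinations


def cluster_consensus(cm, labels):
    """Average consensus value over all pairs of items in the same cluster."""
    scores = {}
    for cluster in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == cluster]
        pairs = list(combinations(members, 2))
        # A single-member cluster has no pairs, so its score is undefined.
        scores[cluster] = (
            sum(cm[i][j] for i, j in pairs) / len(pairs) if pairs else float("nan")
        )
    return scores


def pac(cm, lower=0.1, upper=0.9):
    """Proportion of item pairs whose consensus lies strictly between the bounds.

    The default bounds are hypothetical; substitute the predetermined values.
    """
    pairs = list(combinations(range(len(cm)), 2))
    ambiguous = sum(1 for i, j in pairs if lower < cm[i][j] < upper)
    return ambiguous / len(pairs)


# Toy symmetric consensus matrix for four items (made-up values, not study data).
toy_cm = [
    [1.00, 0.95, 0.05, 0.00],
    [0.95, 1.00, 0.50, 0.02],
    [0.05, 0.50, 1.00, 0.92],
    [0.00, 0.02, 0.92, 1.00],
]
toy_labels = [1, 1, 2, 2]

scores = cluster_consensus(toy_cm, toy_labels)
toy_pac = pac(toy_cm)
print(scores, round(toy_pac, 3))
```

In this toy matrix only one of the six pairs (items 2 and 3, consensus 0.50) is ambiguous, so the PAC is 1/6; both clusters have within-cluster consensus above 0.9.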

Statistical analysis
After cluster identification, we compared baseline characteristics between clusters using analysis of variance (ANOVA) and the Chi-square test for continuous and categorical variables, respectively. The standardized mean differences of clinical characteristics between each cluster and the whole cohort were used to determine specific clinical characteristics for each cluster. Variables with an absolute standardized mean difference of > 0.3 were considered key characteristics of the cluster. Association of each cluster with transplant-free survival was evaluated using Cox proportional hazard regression analysis, reported as a hazard ratio (HR) with 95% confidence interval (CI). Survival status and lung transplantation were ascertained through medical record review and cross-matched with a United States Social Security Death Index (USSDI) search. Since all baseline characteristics were considered for cluster development, we did not adjust for specific variables in the model. P values of < 0.05 were considered statistically significant. All analyses were performed using R, version 4.0.3 (RStudio, Inc., Boston, MA, USA), with the ConsensusClusterPlus package (version 1.46.0) for consensus clustering analysis and the missForest package for imputation of missing data [9].
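The key-characteristic rule (absolute standardized mean difference > 0.3) can be sketched as follows. The text does not give the exact formula, so this sketch assumes the difference between the cluster mean and the whole-cohort mean, divided by the whole-cohort standard deviation. The function names and the toy values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def smd_vs_cohort(cluster_values, cohort_values):
    """(cluster mean - cohort mean) / cohort SD; the formula is an assumption."""
    cohort_sd = stdev(cohort_values)
    if cohort_sd == 0:
        return 0.0
    return (mean(cluster_values) - mean(cohort_values)) / cohort_sd


def key_characteristics(cohort, member_idx, threshold=0.3):
    """Return every variable's SMD and the names exceeding the threshold."""
    smds = {}
    for name, values in cohort.items():
        cluster_values = [values[i] for i in member_idx]
        smds[name] = smd_vs_cohort(cluster_values, values)
    key = sorted(name for name, d in smds.items() if abs(d) > threshold)
    return smds, key


# Made-up cohort of six patients; the first three form the cluster of interest.
toy_cohort = {
    "age": [60, 62, 64, 66, 68, 70],
    "fvc_pct": [50, 55, 60, 75, 80, 82],
    "bmi": [28, 30, 29, 29, 30, 28],
}
smds, key = key_characteristics(toy_cohort, member_idx=[0, 1, 2])
print({name: round(d, 2) for name, d in smds.items()}, key)
```

In the toy data, age and FVC differ from the cohort by well over 0.3 standard deviations and are flagged, while BMI has the same mean in the cluster and the cohort and is not.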

Results
Of 779 patients with suspected f-HP evaluated between January 2005 and December 2020, 448 were compatible with f-HP based on the 2020 ATS/JRS/ALAT guideline. Seventy-one and forty-one patients were excluded for diagnostic confidence < 50% and missing baseline PFTs, respectively. A total of 336 f-HP patients were included in the final analysis (Fig. 1) with a mean age of 65.3 ± 10.9 years. Approximately half were male and had a history of smoking. Definite diagnosis of f-HP was confirmed in 133 (49.6%) with causative antigen exposures identified in 60% of the total cohort.
Consensus clustering analysis was applied to the final set of f-HP patients meeting inclusion criteria. A CDF plot provides the consensus distributions for each k (Fig. 2A). A delta area plot shows relative change in the area under the CDF curve (Fig. 2B). The greatest changes in area were identified between k = 3 and k = 5. As shown on the CM heat map (Fig. 2C, Supplementary Figs. 1-9), the ML algorithm identified three clusters with distinct borders, demonstrating high cluster stability across repeated iterations. The mean cluster consensus score was highest for three clusters (mean consensus score of 0.90) (Fig. 3A), with a favorably low PAC demonstrated for k = 3 (Fig. 3B). Overall, consensus clustering analysis identified three clinically distinct phenotypes.
Of the 336 f-HP patients, 158 (47.0%), 46 (13.7%), and 132 (39.3%) were classified into clusters 1, 2, and 3, respectively. Baseline characteristics of the three clusters are presented in Table 1. Variables differing among the three included age, BMI, diagnostic confidence, causative antigen identification, baseline PFT findings, and OSA as a comorbidity. The standardized mean difference plot was used to identify key clinical characteristics of each cluster, as presented in Fig. 4.
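The cohort flow and cluster proportions reported above can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Counts as reported in the Results.
compatible = 448
excluded_low_confidence = 71
excluded_missing_pft = 41
included = compatible - excluded_low_confidence - excluded_missing_pft

cluster_sizes = {1: 158, 2: 46, 3: 132}

# Percentage of the included cohort in each cluster, to one decimal place.
shares = {k: round(100 * n / included, 1) for k, n in cluster_sizes.items()}

print(included, sum(cluster_sizes.values()), shares)
```

The exclusions leave 336 patients, the three cluster sizes sum to the same total, and the shares round to 47.0%, 13.7%, and 39.3%.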
Patients in cluster 1 were more likely to have preserved pulmonary function, defined by only slightly decreased mean FVC (78.2% predicted) and TLC (78.4% predicted), despite being the oldest of the three clusters at presentation (mean age 68 ± 9.7 years). Mean DLCO was 55.8% of predicted, comparable to cluster 2 but significantly higher than cluster 3. Cluster 2 had lower mean age (60.9 years) and BMI (27.5 kg/m2) with more causative antigen identification (84.8%), particularly avian and hot tub exposure. PFT findings were also more obstructive with air trapping, with lower mean FEV1/FVC ratio (0.69), FEV1 (57.8% predicted), and FEF25-75% (44.5% predicted). Higher mean RV (139.9% predicted), TLC (90.8% predicted), and RV/TLC (151.8% predicted) were also found compared to the other two clusters. Cluster 3 had more severe restriction, with lower mean FVC (51.1% predicted), RV (68.6% predicted), TLC (59.2% predicted), and DLCO (40.5% predicted). Characteristics of the entire cohort and each cluster are presented in Fig. 4 and Table 1.
Treatment details are presented in Table 1. With respect to therapeutic interventions, patients in cluster 2 were more likely not to receive treatment of any kind (26%), including corticosteroid and steroid-sparing agents. Significantly higher antigen avoidance was also observed in this cluster (50%).
Fig. 3 (A) The bar plot displays the mean consensus score for different numbers of clusters, where k ranges from two to ten. Each colored bar within a specific number represents an individual cluster from separate clustering simulations. This iterative approach was adopted to evaluate stability and consistency of the clustering results. (B) The PAC values assess ambiguously clustered pairs
Fig. 4 The standardized mean difference plot identifying clinical characteristics of each cluster
Of those in cluster 1, 53 (33.5%) died and 12 (7.6%) underwent lung transplantation. In cluster 2, 10 (21.7%) died and 2 (4.3%) underwent lung transplantation. In cluster 3, 60 (45.5%) died and 11 (8.3%) underwent lung transplantation. When compared to cluster 1, risk of lung transplantation or death was significantly lower for cluster 2 (hazard ratio (HR) 0.42; 95% CI, 0.21-0.82; P = 0.01) and significantly higher for cluster 3 (HR 1.76; 95% CI, 1.24-2.48; P = 0.001). Kaplan-Meier survival curves for the three clusters are presented in Fig. 5.
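As a consistency check on the reported hazard ratios, the standard error of log(HR) can be recovered from a 95% confidence interval and converted to an approximate two-sided P value. This sketch assumes a symmetric Wald interval on the log scale with a normal reference distribution, which is a common convention but is not stated in the text; the function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the z statistic, and a two-sided P value from a 95% CI.

    Assumes the interval is exp(log HR +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se2, z2, p2 = wald_from_ci(0.42, 0.21, 0.82)  # cluster 2 vs. cluster 1
se3, z3, p3 = wald_from_ci(1.76, 1.24, 2.48)  # cluster 3 vs. cluster 1

print(f"cluster 2: z = {z2:.2f}, P = {p2:.3f}")
print(f"cluster 3: z = {z3:.2f}, P = {p3:.4f}")
```

Under these assumptions the recovered P values are close to 0.01 and 0.001, in line with the values reported for clusters 2 and 3.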

Discussion
Phenotypic characterization resulting in prognostic or differential outcomes has not been previously described in patients with f-HP. Individual clinical parameters have been reported as relevant to predicting outcome (exposure history, lung function, and radiologic findings), though such findings may be heterogeneous or present variably among diverse sets of patients [4,5,13]. A cluster algorithm approach may identify groups of similar patients using a wide-ranging set of clinical characteristics [6]. A primary advantage of cluster analysis is the potential discovery of new or unexpected disease patterns that may be unintuitive or difficult to characterize due to multifaceted or overlapping presentations. In this study, an unsupervised ML consensus clustering algorithm identified three distinct clusters of f-HP patients based on presenting findings. Key features of each cluster were highlighted by pulmonary function and causative antigen exposure history, despite the inclusion of multiple clinical variables and comorbidities in the analysis. Importantly, the three clusters translated to separate transplant-free survival in the setting of typical treatment or antigen avoidance strategies.
Cluster 1 accounted for the largest share of the f-HP patients included in our cohort (47.0%). Patients in this group had mild restrictive pulmonary physiology with slightly decreased mean FVC and DLCO, despite older age at presentation. Mortality or transplant outcomes were observed on average after ten or more years of follow-up. Higher pulmonary function may represent earlier diagnosis, though the subsequently longer survival seen here may represent slower progression or better response to subsequent antigen avoidance or treatment. Similarly, cluster 3, characterized by more severe restrictive physiology, may also represent more advanced or late-stage disease despite younger age at presentation, as f-HP may present at any age. Baseline FVC and DLCO have been previously described as outcome predictors in f-HP [14,15]. Notably, UIP HRCT pattern (6 vs. 8%) and honeycombing (18 vs. 21%) were found with similar frequency between the two groups.
Our study found patients in cluster 2 were uniquely characterized by obstructive physiology on PFTs. The impact of obstruction on outcome or its relation to other clinical characteristics remains unclear in patients with f-HP. Obstruction may be seen in HP as an acute or earlier manifestation of small airways involvement. Mosaic attenuation or expiratory air trapping, typical HRCT findings in f-HP, may also represent small airway involvement with physiologic obstruction [16]. Zuniga et al. found patients with f-HP had improvement in likely small airway-related obstruction, as characterized by a decrease in the phase 3 slope of ultrasonic pneumography, after immunosuppressive treatment [17]. Obstructive physiology might represent active and potentially reversible small airways inflammation or injury responsive to therapy, and perhaps better survival. Patients in cluster 2 were also younger, had lower BMI, and had higher rates of identifiable causative antigen, particularly avian or hot tub exposure. Younger age, identifiable causative exposure, and antigen avoidance have been previously reported as associated with improved mortality [4,13,18]. Our study confirms findings from a previous report demonstrating better survival in patients with history of avian antigen exposure [13]. Since exposure to avian antigens or hot tubs is often easier to identify and avoid, such patients might also have better outcomes. Additionally, compared to clusters 1 and 3, mosaic attenuation occurred more frequently. Honeycombing was also found in 11%, with none having typical or probable UIP HRCT patterns.
As discussed, cluster analysis not only identifies distinctive presenting characteristics inherent to a particular group but may also derive guidance for tailoring appropriate treatment according to associated disease progression or survival outcome. Our study found that patients in cluster 2 had more favorable outcomes, with roughly a quarter (26%) abstaining from any medical treatment. In contrast, patients in cluster 3 experienced worse survival despite nearly all receiving initial corticosteroids and half going on to long-term steroid-sparing agents. The earlier use of antifibrotics when meeting criteria for progressive pulmonary fibrosis (as suggested for cluster 3) may be an appropriate treatment strategy.
Our study has several limitations. First, selection bias is possible with the use of a single tertiary referral center and patients evaluated over a decade or more of clinical experience. Despite the systematic use of recent international consensus criteria to align diagnostic uncertainty, historical practices and their evolution over time may limit the availability of all clinical parameters. Original multidisciplinary team discussions were not documented for all patients; however, extensive clinical, HRCT, and pathological reports were available for defining diagnostic confidence levels according to the 2020 ATS/JRS/ALAT guideline. Excluded patients who did not have baseline PFTs (N = 41) were also younger with a higher proportion of 'definite' HP diagnostic confidence levels (Supplementary Table S1), which might have affected the current analyses had they been included. While a broad range of clinical variables were included in the cluster analysis, there may still be unaccounted or unknown factors that may impact or change current phenotypic characterizations, including timing of symptom onset. Finally, an all-cause mortality endpoint may not entirely represent the direct impact of disease progression from f-HP and may also reflect contributions from unrelated comorbidities or complications. We attempted to account for this with the inclusion of selected comorbidities in the clustering model, of which none appeared to be distinguishing.

Conclusions
We identified three distinct phenotypes of f-HP using an unsupervised machine learning consensus clustering approach. These three clusters, as characterized by pulmonary function testing (mild vs. more severe restriction vs. obstruction) and identifiable antigen exposure history, translated to unique transplant-free survival outcomes.

Fig. 2 (A) CDF plot displaying consensus distributions for each k. Each color represents a specific number of clusters. (B) Delta area plot (the x-axis (k) signifies the number of clusters). The plot demonstrates relative changes in area beneath the CDF curve with increasing numbers of clusters. (C) Consensus matrix heat map depicting consensus values on a white to blue color scale of each cluster

Fig. 5 Kaplan-Meier survival curves comparing transplant-free survival among each cluster

Table 1
Baseline characteristics of fibrotic hypersensitivity pneumonitis patients as classified by cluster